System and method for speech-enabled automated agent assistance within a cloud-based contact center

ABSTRACT

Methods to reduce agent effort and improve customer experience quality through artificial intelligence. The Agent Assist tool provides contact centers with an innovative tool designed to reduce agent effort, improve quality and reduce costs by minimizing search and data entry tasks. The Agent Assist tool is natively built and fully unified within the agent interface while keeping all data internally protected from third-party sharing.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Patent Application No. 62/870,913, filed Jul. 5, 2019, entitled “SYSTEM AND METHOD FOR AUTOMATION WITHIN A CLOUD-BASED CONTACT CENTER,” which is incorporated herein by reference in its entirety.

BACKGROUND

Today, contact centers are primarily on-premise software solutions. This requires an enterprise to make a substantial investment in hardware, installation and regular maintenance of such solutions. Using on-premise software, agents and supervisors are stationed in an on-site call center. In addition, a dedicated IT staff is required because on-site software may be too complicated for supervisors and agents to handle on their own. Another drawback of on-premise solutions is that such solutions cannot be easily enhanced to include capabilities that meet the current demands of technology, such as automation. Thus, there is a need for a solution that enhances the agent experience and improves interactions with customers who contact such centers.

SUMMARY

Disclosed herein are systems and methods for providing a cloud-based contact center solution providing agent automation through the use of, e.g., artificial intelligence and the like.

In accordance with an aspect, there is disclosed a method, comprising receiving a speech communication from a customer; converting the speech communication to text to perform inference processing on the text to determine a customer intent; automatically analyzing the text to determine a subject of the speech communication and key terms associated with the subject; automatically parsing a knowledgebase using the key terms for at least one responsive answer associated with the subject; and providing the solution to an agent in a unified interface during the communication with the customer. In accordance with another aspect, a cloud-based software platform is disclosed in which the example method above is performed.

Other systems, methods, features and/or advantages will be or may become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features and/or advantages be included within this description and be protected by the accompanying claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The components in the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding parts throughout the several views.

FIG. 1 illustrates an example environment;

FIG. 2 illustrates example components that provide automation, routing and/or omnichannel functionalities within the context of the environment of FIG. 1;

FIG. 3 illustrates a high-level overview of interactions, components and flow of Agent Assist in accordance with the present disclosure;

FIG. 4 illustrates an example operational flow in accordance with the present disclosure and provides additional details of the high-level overview shown in FIG. 3;

FIGS. 5A, 5B and 5C illustrate an example unified interface showing aspects of the operational flows of FIGS. 3 and 4;

FIG. 6 illustrates an operational flow to analyze a conversation to create smart notes;

FIG. 7 illustrates an example smart notes user interface;

FIG. 8 illustrates an operational flow to analyze a conversation to pre-populate forms;

FIG. 9 illustrates an example automatic scheduling user interface;

FIG. 10 illustrates an overview of the real-time analytics aspect of Agent Assist;

FIG. 11 illustrates an example operational flow to classify agent conversations;

FIG. 12 illustrates an example operational flow of escalation assistance; and

FIG. 13 illustrates an example computing device.

DETAILED DESCRIPTION

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. Methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure. While implementations will be described within a cloud-based contact center, it will become evident to those skilled in the art that the implementations are not limited thereto.

The present disclosure is generally directed to a cloud-based contact center and, more particularly, methods and systems for providing intelligent, automated services within a cloud-based contact center. With the rise of cloud-based computing, contact centers that take advantage of this infrastructure are able to quickly add new features and channels. Cloud-based contact centers improve the customer experience by leveraging application programming interfaces (APIs) and software development kits (SDKs) to allow the contact center to change in response to an enterprise's needs. For example, communications channels may be easily added as the APIs and SDKs enable adding channels, such as SMS/MMS, social media, web, etc. Cloud-based contact centers provide a platform that enables frequent updates. Yet another advantage of cloud-based contact centers is increased reliability, as cloud-based contact centers may be strategically and geographically distributed around the world to optimally route calls to reduce latency and provide the highest quality experience. As such, customers are connected to agents faster and more efficiently.

Example Cloud-Based Contact Center Architecture

FIG. 1 is an example system architecture 100, and illustrates example components, functional capabilities and optional modules that may be included in a cloud-based contact center infrastructure solution. Customers 110 interact with a contact center 150 using voice, email, text, and web interfaces in order to communicate with agent(s) 120 through a network 130 and one or more channels 140. The agent(s) 120 may be remote from the contact center 150 and handle communications with customers 110 on behalf of an enterprise or other entity. The agent(s) 120 may utilize devices, such as but not limited to, workstations, desktop computers, laptops, telephones, a mobile smartphone and/or a tablet. Similarly, customers 110 may communicate using a plurality of devices, including but not limited to, a telephone, a mobile smartphone, a tablet, a laptop, a desktop computer, or other device. For example, telephone communication may traverse networks such as a public switched telephone network (PSTN), Voice over Internet Protocol (VoIP) telephony (via the Internet), a Wide Area Network (WAN) or a Large Area Network. The network types are provided by way of example and are not intended to limit types of networks used for communications.

The contact center 150 may be cloud-based and distributed over a plurality of locations. The contact center 150 may include servers, databases, and other components. In particular, the contact center 150 may include, but is not limited to, a routing server, a SIP server, an outbound server, automated call distribution (ACD), a computer telephony integration server (CTI), an email server, an IM server, a social server, a SMS server, and one or more databases for routing, historical information and campaigns.

The routing server may serve as an adapter or interface between the switch and the remainder of the routing, monitoring, and other communication-handling components of the contact center. The routing server may be configured to process PSTN calls, VoIP calls, and the like. For example, the routing server may be configured with the CTI server software for interfacing with the switch/media gateway and contact center equipment. In other examples, the routing server may include the SIP server for processing SIP calls. The routing server may extract data about the customer interaction such as the caller's telephone number (often known as the automatic number identification (ANI) number), or the customer's internet protocol (IP) address, or email address, and communicate with other contact center components in processing the interaction.

The ACD is used by inbound, outbound and blended contact centers to manage the flow of interactions by routing and queuing them to the most appropriate agent. Within the CTI, software connects the ACD to a servicing application (e.g., customer service, CRM, sales, collections, etc.), and looks up or records information about the caller. CTI may display a customer's account information on the agent desktop when an interaction is delivered.

For inbound SIP messages, the routing server may use statistical data from the statistics server and a routing database to route the SIP request message. A response may be sent to the media server directing it to route the interaction to a target agent 120. The routing database may include: customer relationship management (CRM) data; data pertaining to one or more social networks (including, but not limited to, network graphs capturing social relationships within relevant social networks, or media updates made by members of relevant social networks); agent skills data; data extracted from third party data sources including cloud-based data sources such as CRM; or any other data that may be useful in making routing decisions.

Customers 110 may initiate inbound communications (e.g., telephony calls, emails, chats, video chats, social media posts, etc.) to the contact center 150 via an end user device. End user devices may be a communication device, such as a telephone, wireless phone, smart phone, personal computer, electronic tablet, etc., to name some non-limiting examples. Customers 110 operating the end user devices may initiate, manage, and respond to telephone calls, emails, chats, text messaging, web-browsing sessions, and other multi-media transactions. Agent(s) 120 and customers 110 may communicate with each other and with other services over the network 130. For example, a customer calling on a telephone handset may connect through the PSTN and terminate on a private branch exchange (PBX). A video call originating from a tablet may connect through the network 130 and terminate on the media server. The channels 140 are coupled to the communications network 130 for receiving and transmitting telephony calls between customers 110 and the contact center 150. A media gateway may include a telephony switch or communication switch for routing within the contact center. The switch may be a hardware switching system or a soft switch implemented via software. For example, the media gateway may communicate with an automatic call distributor (ACD), a private branch exchange (PBX), an IP-based software switch and/or other switch to receive Internet-based interactions and/or telephone network-based interactions from a customer 110 and route those interactions to an agent 120. More detail of these interactions is provided below.

As another example, a customer smartphone may connect via the WAN and terminate on an interactive voice response (IVR)/intelligent virtual agent (IVA) component. IVRs are self-service voice tools that automate the handling of incoming and outgoing calls. Advanced IVRs use speech recognition technology to enable customers 110 to interact with them by speaking instead of pushing buttons on their phones. IVR applications may be used to collect data, schedule callbacks and transfer calls to live agents. IVA systems are more advanced and utilize artificial intelligence (AI), machine learning (ML), advanced speech technologies (e.g., natural language understanding (NLU)/natural language processing (NLP)/natural language generation (NLG)) to simulate live and unstructured cognitive conversations for voice, text and digital interactions. IVA systems may cover a variety of media channels in addition to voice, including, but not limited to, social media, email, SMS/MMS, IM, etc., and they may communicate with their counterpart's application (not shown) within the contact center 150. The IVA system may be configured with a script for querying customers on their needs. The IVA system may ask an open-ended question such as, for example, “How can I help you?” and the customer 110 may speak or otherwise enter a reason for contacting the contact center 150. The customer's response may then be used by a routing server to route the call or communication to an appropriate contact center resource.

In response, the routing server may find an appropriate agent 120 or automated resource to which an inbound customer communication is to be routed, for example, based on a routing strategy employed by the routing server, and further based on information about agent availability, skills, and other routing parameters provided, for example, by the statistics server. The routing server may query one or more databases, such as a customer database, which stores information about existing clients, such as contact information, service level agreement requirements, nature of previous customer contacts and actions taken by the contact center to resolve any customer issues, etc. The routing server may query the customer information from the customer database via an ANI or any other information collected by the IVA system.
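By way of non-limiting illustration, the skill- and availability-based selection described above may be sketched as follows. The `Agent` record, the `route` function, and the "largest skill overlap" strategy are assumptions made for this example only; the disclosure does not prescribe a particular routing strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Hypothetical agent record; field names are illustrative only.
    name: str
    skills: set = field(default_factory=set)
    available: bool = True


def route(required_skills, agents):
    """Return the available agent covering the most required skills, or None."""
    candidates = [a for a in agents if a.available and a.skills & required_skills]
    if not candidates:
        return None
    # Assumed strategy: prefer the agent with the largest skill overlap.
    return max(candidates, key=lambda a: len(a.skills & required_skills))


agents = [
    Agent("alice", {"billing", "spanish"}),
    Agent("bob", {"billing", "retirement"}, available=False),
    Agent("carol", {"billing", "retirement"}),
]
chosen = route({"retirement", "billing"}, agents)
```

In this sketch the unavailable agent is skipped even though the skills match, and a request that no available agent can serve returns `None`, at which point an implementation might queue the interaction or route it to an automated resource.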

Once an appropriate agent and/or automated resource is identified as being available to handle a communication, a connection may be made between the customer 110 and an agent device of the identified agent 120 and/or the automated resource. Collected information about the customer and/or the customer's historical information may also be provided to the agent device for aiding the agent in better servicing the communication. In this regard, each agent device may include a telephone adapted for regular telephone calls, VoIP calls, etc. The agent device may also include a computer for communicating with one or more servers of the contact center and performing data processing associated with contact center operations, and for interfacing with customers via voice and other multimedia communication mechanisms.

The contact center 150 may also include a multimedia/social media server for engaging in media interactions other than voice interactions with the end user devices and/or other web servers 160. The media interactions may be related, for example, to email, vmail (voice mail through email), chat, video, text-messaging, web, social media, co-browsing, etc. In this regard, the multimedia/social media server may take the form of any IP router conventional in the art with specialized hardware and software for receiving, processing, and forwarding multi-media events.

The web servers 160 may include, for example, social media sites, such as Facebook, Twitter, Instagram, etc. In this regard, the web servers 160 may be provided by third parties and/or maintained outside of the contact center 150, and communicate with the contact center 150 over the network 130. The web servers 160 may also provide web pages for the enterprise that is being supported by the contact center 150. End users may browse the web pages and get information about the enterprise's products and services. The web pages may also provide a mechanism for contacting the contact center, via, for example, web chat, voice call, email, WebRTC, etc.

The integration of real-time and non-real-time communication services may be performed by a unified communications (UC)/presence server. Real-time communication services include Internet Protocol (IP) telephony, call control, instant messaging (IM)/chat, presence information, real-time video and data sharing. Non-real-time applications include voicemail, email, SMS and fax services. The communications services are delivered over a variety of communications devices, including IP phones, personal computers (PCs), smartphones and tablets. Presence provides real-time status information about the availability of each person in the network, as well as their preferred method of communication (e.g., phone, email, chat and video).

Recording applications may be used to capture and play back audio and screen interactions between customers and agents. Recording systems should capture everything that happens during interactions and what agents do on their desktops. Surveying tools may provide the ability to create and deploy post-interaction customer feedback surveys in voice and digital channels. Typically, the IVR/IVA development environment is leveraged for survey development and deployment rules. Reporting/dashboards are tools used to track and manage the performance of agents, teams, departments, systems and processes within the contact center.

Automation

As shown in FIG. 1, automated services may enhance the operation of the contact center 150. In one aspect, the automated services may be implemented as an application running on a mobile device of a customer 110, one or more cloud computing devices (generally labeled automation servers 170 connected to the end user device over the network 130), one or more servers running in the contact center 150 (e.g., automation infrastructure 200), or combinations thereof.

With respect to the cloud-based contact center, FIG. 2 illustrates an example automation infrastructure 200 implemented within the cloud-based contact center 150. The automation infrastructure 200 may automatically collect information from a customer 110 through, e.g., a user interface/voice interface 202, where the collection of information may not require the involvement of a live agent. The user input may be provided as free speech or text (e.g., unstructured, natural language input). This information may be used by the automation infrastructure 200 for routing the customer 110 to an agent 120, to automated resources in the contact center 150, as well as gathering information from other sources to be provided to the agent 120. In operation, the automation infrastructure 200 may parse the natural language user input using a natural language processing module 210 to infer the customer's intent using an intent inference module 212 in order to classify the intent. Where the user input is provided as speech, the speech is transcribed into text by a speech-to-text system 206 (e.g., a large vocabulary continuous speech recognition or LVCSR system) as part of the parsing by the natural language processing module 210. The communication manager 204 monitors user inputs and presents notifications within the user interface/voice interface 202. Responses by the automation infrastructure 200 to the customer 110 may be provided as speech using the text-to-speech system 208.

The intent inference module automatically infers the customer's 110 intent from the text of the user input using artificial intelligence or machine learning techniques. These artificial intelligence techniques may include, for example, identifying one or more keywords from the user input and searching a database of potential intents (e.g., call reasons) corresponding to the given keywords. The database of potential intents and the keywords corresponding to the intents may be automatically mined from a collection of historical interaction recordings, in which a customer may provide a statement of the issue, and in which the intent is explicitly encoded by an agent.
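As a minimal sketch of the keyword-based technique described above, the following example scores each candidate intent by the number of its keywords found in the transcribed input. The `INTENT_KEYWORDS` table and the `infer_intent` function are hypothetical; in practice the table might be mined from historical interaction recordings rather than written by hand.

```python
import re
from collections import Counter

# Hypothetical keyword-to-intent table, standing in for the database of
# potential intents (e.g., call reasons) described in the text.
INTENT_KEYWORDS = {
    "order_status": {"status", "order", "shipped", "tracking"},
    "new_order": {"place", "buy", "purchase", "order"},
    "retirement_plan": {"retirement", "ira", "pension"},
}


def infer_intent(text):
    """Return the intent whose keywords best overlap the input, or None."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    scores = Counter(
        {intent: len(tokens & keywords) for intent, keywords in INTENT_KEYWORDS.items()}
    )
    intent, score = scores.most_common(1)[0]
    return intent if score > 0 else None
```

A statistical or neural classifier could replace the overlap score without changing the surrounding flow; the keyword lookup is shown only because it is the example technique named in the text.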

Some aspects of the present disclosure relate to automatically navigating an IVR system of a contact center on behalf of a user using, for example, the loaded script. In some implementations of the present disclosure, the script includes a set of fields (or parameters) of data that are expected to be required by the contact center in order to resolve the issue specified by the customer's 110 intent. In some implementations of the present disclosure, some of the fields of data are automatically loaded from a stored user profile. These stored fields may include, for example, the customer's 110 full name, address, customer account numbers, authentication information (e.g., answers to security questions) and the like.
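For illustration only, loading script fields from a stored user profile might look like the following sketch. The field names and the `fill_script` helper are assumptions made for this example; fields not found in the profile are returned separately so that they can be requested from the customer.

```python
def fill_script(script_fields, profile):
    """Split script fields into those found in the profile and those missing."""
    filled = {name: profile[name] for name in script_fields if name in profile}
    missing = [name for name in script_fields if name not in profile]
    return filled, missing


# Hypothetical script fields and stored profile.
script_fields = ["full_name", "account_number", "order_id"]
profile = {"full_name": "Pat Example", "account_number": "12345", "address": "1 Main St"}

filled, missing = fill_script(script_fields, profile)
```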

Some aspects of the present disclosure relate to the automatic authentication of the customer 110 with the provider. For example, in some implementations of the present disclosure, the user profile may include authentication information that would typically be requested of users accessing customer support systems such as usernames, account identifying information, personal identification information (e.g., a social security number), and/or answers to security questions. As additional examples, the automation infrastructure 200 may have access to text messages and/or email messages sent to the customer's 110 account on the end user device in order to access one-time passwords sent to the customer 110, and/or may have access to a one-time password (OTP) generator stored locally on the end user device. Accordingly, implementations of the present disclosure may be capable of automatically authenticating the customer 110 with the contact center prior to an interaction.

In some implementations of the present disclosure, an application programming interface (API) is used to interact with the provider directly. The provider may define a protocol for making commonplace requests to their systems. This API may be implemented over a variety of standard protocols such as Simple Object Access Protocol (SOAP) using Extensible Markup Language (XML), a Representational State Transfer (REST) API with messages formatted using XML or JavaScript Object Notation (JSON), and the like. Accordingly, a customer experience automation system 200 according to one implementation of the present disclosure automatically generates a formatted message in accordance with an API defined by the provider, where the message contains the information specified by the script in appropriate portions of the formatted message.
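A minimal sketch of generating such a formatted message as JSON is given below. The message schema (an `intent` key and a `fields` object) and the `build_request` helper are invented for this example; an actual provider would define its own schema and transport.

```python
import json


def build_request(intent, fields):
    """Serialize script data into a JSON message body (hypothetical schema)."""
    payload = {"intent": intent, "fields": fields}
    # sort_keys makes the output deterministic, which simplifies testing.
    return json.dumps(payload, sort_keys=True)


message = build_request("order_status", {"account_number": "12345", "order_id": "A-1"})
```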

Some aspects of the present disclosure relate to systems and methods for automating and augmenting aspects of an interaction between the customer 110 and a live agent of the contact center. In an implementation, once an interaction, such as through a phone call, has been initiated with the agent 120, metadata regarding the conversation is displayed to the customer 110 and/or agent 120 in the UI throughout the interaction. Information, such as call metadata, may be presented to the customer 110 through the UI 205 on the customer's 110 mobile device 105. Examples of such information might include, but not be limited to, the provider, department, call reason, agent name, and a photo of the agent.

According to some aspects of implementations of the present disclosure, both the customer 110 and the agent 120 can share relevant content with each other through the application (e.g., the application running on the end user device). The agent may share their screen with the customer 110 or push relevant material to the customer 110.

In yet another implementation, the automation infrastructure 200 may also “listen” in on the conversation and automatically push relevant content from a knowledge base to the customer 110 and/or agent 120. For example, the application may use a real-time transcription of the customer's input (e.g., speech) to query a knowledgebase to provide a solution to the agent 120. The agent may share a document describing the solution with the customer 110. The application may include several layers of intelligence where it gathers customer intelligence to learn everything it can about why the customer 110 is calling. Next, it may perform conversation intelligence, which is extracting more context about the customer's intent. Next, it may perform interaction intelligence to pull information from other sources about the customer 110. The automation infrastructure 200 may also perform contact center intelligence to implement WFM/WFO features of the contact center 150.

Agent Assist Overview

Thus, in the context of FIGS. 1-2, the present disclosure provides improvements by providing an innovative tool to reduce agent effort and improve customer experience quality through artificial intelligence (referred to herein as “Agent Assist”). Agent Assist is an innovative tool used within, e.g., contact centers, designed to reduce agent effort, improve quality and reduce costs by minimizing search and data entry tasks. Agent Assist is fully unified within the agent interface while keeping all data internally protected from third-party sharing. Agent Assist improves quality and reduces costs by minimizing search and data entry tasks through the use of AI capabilities. Agent Assist simplifies agent effort and improves Customer Satisfaction/Net Promoter Score (CSAT/NPS).

Agent Assist is powered by artificial intelligence (AI) to provide real-time guidance for frontline employees to respond to customer needs quickly and accurately. For example, as a customer 110 states a need, agents 120 are provided answers or supporting information immediately to expedite the conversation and simplify tasks. Agent Assist determines why customers are calling and what their intent is. Similarly, IVR assist makes recommendations to a supervisor to optimize the IVR for a better customer experience. For example, Agent Assist helps optimize IVR questions to match customers' reasons for calling and their intent.

By leveraging automated assistance and reducing agent-supervisor ad-hoc interactions, Agent Assist gives supervisors more time to focus on workforce engagement activities. Agent Assist reduces manual supervision and assistance. Agent Assist improves agent proficiency and accuracy. Agent Assist reduces short- and long-term training efforts through real-time error identification, eliminates busy work with smart note technology (the ability to systematically recognize and enter all key aspects of an interaction into the conversation notes), and improves handle time with in-app automations.

With reference to FIG. 3, there is illustrated a high-level overview of interactions, components and flow of Agent Assist in accordance with the present disclosure. In operation, a customer 110 will contact the cloud-based contact center 150 through one or more of the channels 140, as shown in FIG. 1. The agent 120 to whom the customer 110 is routed may listen to the customer 110 while at the same time the Agent Assist functionality pulls information using a knowledge graph engine 308. The knowledge graph engine 312 gathers information from one or more of a knowledgebase 302, a customer relationship management (CRM) platform/a customer service management (CSM) platform 304, and/or conversational transcripts 306 of other agent conversations to provide contextually relevant information to the agent. Additionally, information captured within the agent interface (see, FIGS. 5A-5C, 7 and 13) can be automatically added to account profiles or work item tickets, within the CRM, without any additional agent effort. Agent Assist is an intelligent advisory tool which supplies data-driven real-time recommendations, next best actions and automations to aid agents in customer interactions and guide them to quality and outcome excellence. This may include making recommendations based on interactions, discussions and monitored KPIs. Agent Assist helps match agent skill to the reasons why customers are calling. In addition, information may be provided to the agent from third-party sources via the web servers 160 (e.g., knowledge bases of product manufacturers) or social media platforms.

With reference to FIG. 4, there is illustrated an example operational flow 400 in accordance with the present disclosure, which provides additional details of the high-level overview shown in FIG. 3. At 402, the process begins wherein the system listens to the customer and agent voices as they speak (S. 404). For example, the automation infrastructure 200 may process the customer speech, as described with regard to FIG. 2. At 406, the agent voice is separated from the customer voice into their own respective channels. Once separated, at 408, unsupervised methods may be used to automatically perform one or more of the following non-limiting processes: apply biometrics to authenticate the caller/customer, predict a caller gender, predict a caller age category, predict a caller accent, and/or predict other caller demographics. Optionally or alternatively, if speaker separation is not performed at 406, then the system may distinguish between the customer and the agent by analyzing the time that either the agent or the customer talks or listens, identifying a signature of the agent voice or user voice, or applying non-supervised methods to separate user and agent voice in real-time.

The operational flow continues at 410, wherein the customer voice and/or agent voice may be analyzed before transcription to extract one or more of the following non-limiting features:

Pain

Agony

Empathy

Being sarcastic

Speech speed

Tone

Frustration

Enthusiasm

Interest

Engagement

Understanding these features helps the agent 120 better understand the customer 110. The agent 120 will be better able to understand the customer's problem or issues so a resolution can be more easily achieved.

At 412, the conversation between the agent and the customer is transcribed in either real-time or post-call. This may be performed by the speech-to-text component of the automation infrastructure 200 and saved to a database. At 414, the agent voice channel and the customer voice channel are separated. At 416, the automation infrastructure 200 determines information about the customer and agent, such as intent, entities (e.g., names, locations, times, etc.), sentiment, and sentence phrases (e.g., verb, noun, adjective, etc.). At 418, from the information determined at 416, Agent Assist provides useful insight to the agent 120. This information, as shown in FIG. 3, may be information retrieved from the relevant CRM, the most relevant documents in the related knowledge base, and/or a relevant conversation and interaction that occurred in the past that was related to a similar topic or other feature of the interaction between the agent and the customer. Information pulled from the knowledgebase may be highlighted to the agent in a display, such as shown in FIGS. 5A-5C, 7 and 13.
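As one hedged illustration of retrieving "the most relevant documents" at 418, the sketch below ranks knowledgebase entries by how many extracted key terms they contain. The `rank_documents` function, the in-memory dictionary standing in for the knowledgebase 302, and the term-overlap score are all assumptions for this example; a deployed system might instead use a search index or a knowledge graph.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_documents(key_terms, knowledgebase, top_n=1):
    """Return titles of the top_n documents sharing the most key terms."""
    terms = {term.lower() for term in key_terms}
    scored = [
        (len(terms & tokenize(body)), title) for title, body in knowledgebase.items()
    ]
    scored = [item for item in scored if item[0] > 0]  # drop non-matching documents
    scored.sort(key=lambda item: (-item[0], item[1]))  # best score first, then title
    return [title for _, title in scored[:top_n]]


# Hypothetical knowledgebase contents.
knowledgebase = {
    "Retirement plan types": "Overview of IRA, 401k and pension retirement plan options",
    "Order tracking": "How to check the status of an order",
}
results = rank_documents(["retirement", "plan"], knowledgebase)
```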

Thus, in accordance with the operational flow of FIG. 4, Agent Assist provides real-time guidance for frontline employees to respond to customer needs quickly and accurately. As a customer 110 states his or her need, agents 120 will be delivered answers or supporting information immediately to expedite the conversation and simplify agent effort. By delivering information from CRM 304 or knowledgebase 302 to the agent 120 in milliseconds, agent handling time will be reduced and customers will realize a time savings and ultimately a reduction in effort to interact with businesses.

FIGS. 5A-5C illustrate an example unified interface 500 showing aspects of the operational flows of FIGS. 3 and 4. In FIGS. 5A-5C, the agent 120 is speaking on behalf of a financial institution. The agent 120 could be speaking on behalf of any entity for which the cloud-based contact center 150 serves. As shown in FIG. 5A, the customer 110 is calling to ask questions about setting up a retirement plan. Because the context of the conversation is understood by the automation infrastructure 200 to be related to a financial institution, Agent Assist identifies that the term “retirement plan” is meaningful and highlights it to the agent. As shown in FIG. 5B, Agent Assist provides a prompt 502 indicating to the agent 120 that there are many different types of retirement plans that the customer 110 can choose from. A button or other control 504 is provided such that the agent 120 can click a link to see more information. The link to the information may provide text, audio, video, messages, tweets, posts, etc. to the agent 120. Agent Assist provides a segment and/or snippet in the text that is relevant to the customer's needs. In other implementations, Agent Assist provides a relevant interaction in the past (e.g., a similar call with a similar issue that agent 120 was able to address, etc.) or provides cross-channel information (e.g., finds a most relevant e-mail for a call, etc.). As shown in FIG. 5C, Agent Assist may provide an option 506 to schedule a meeting or call between the customer 110 and a financial planner (i.e., a person with additional knowledge within the entity who may satisfy the customer's request to the agent 120). Additional details of the scheduling operation are described below with reference to FIG. 8.
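The highlighting of a meaningful term such as “retirement plan” may be sketched, under stated assumptions, as a case-insensitive search over the transcript for a list of domain terms. The `highlight_terms` function and the `**` marker are hypothetical; the actual interface of FIGS. 5A-5C may render highlights in any manner.

```python
import re


def highlight_terms(transcript, terms, marker="**"):
    """Wrap each occurrence of a domain term in the transcript with a marker."""
    if not terms:
        return transcript
    # Longest terms first so that multi-word terms win over their substrings.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", transcript)


highlighted = highlight_terms(
    "I have questions about setting up a Retirement Plan.", ["retirement plan"]
)
```

The list of terms would, in this sketch, be selected according to the context of the conversation (here, a financial institution), so that the same transcript could be highlighted differently for a different entity.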

Smart Notes

FIGS. 6 and 7 provide details about the smart notes feature of Agent Assist. The smart notes feature may be used by the agent 120 to summarize a conversation with the customer 110, extract relevant portions of the interaction, etc. Important items in the smart notes may be highlighted using bold fonts or other emphasis. The process begins at 602 where operations 404-414 are performed. These may be performed in parallel with the other features described above. At 604, information is extracted from the transcript and populated into the smart notes. As shown in FIG. 7, a call notes user interface 702 is provided to the agent 120 with information from the call with the customer 110 pre-populated in an input field 704. For example, in the context of a retailer, the phrases “status of my last order” and “place a new order” may be determined to be relevant information by the automation infrastructure 200, and are populated into the call notes input field 704. At 606, important terms may be highlighted. At 608, the process ends. As shown in FIG. 7, the call notes user interface 702 may provide an option for the user to edit and/or add notes.

In accordance with the operations performed in FIG. 6, Agent Assist may analyze the conversation between the agent 120 and the customer 110 to create smart notes. This conversation could be a phone call, a text message, chat or video call, etc. Smart notes extracts the most relevant information from this conversation. For instance, after a conversation, Agent Assist may determine that the discussion between the agent and the customer was about “canceling an old order” and “placing a new order.” These would be extracted as Smart Notes and provided to the agent, who has an option to accept or modify the note, as shown in FIG. 7. To achieve the above, Agent Assist may separate the conversation between customer 110 and agent 120 to find words and phrases that are common between agents and customers, when a customer confirms a question, or when an agent confirms what the customer says. For instance, the agent 120 may say, “Ok, so you would like to place a new order, correct?” In this case, the Smart Note would be a summary of the call about placing a new order.
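The confirmation heuristic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Utterance` type, the `CONFIRM_PATTERN` regular expression, the `AFFIRMATIVE` word list, and the `extract_smart_notes` function are all hypothetical, and the sketch assumes the transcript has already been separated by speaker.

```python
import re
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # "agent" or "customer"
    text: str


# Hypothetical pattern: the agent restates the request and asks for confirmation.
# \u2014 lets the optional separator also be a long dash, as in some transcripts.
CONFIRM_PATTERN = re.compile(
    r"you would like to (?P<topic>.+?)\s*(?:[-,;:\u2014]\s*)?"
    r"(?:correct|right|is that right)\?",
    re.IGNORECASE,
)

# Hypothetical list of words that count as the customer agreeing.
AFFIRMATIVE = {"yes", "yeah", "correct", "right", "exactly"}


def extract_smart_notes(transcript):
    """Return topics the agent restated and the customer then confirmed."""
    notes = []
    for current, following in zip(transcript, transcript[1:]):
        if current.speaker != "agent" or following.speaker != "customer":
            continue
        match = CONFIRM_PATTERN.search(current.text)
        if not match:
            continue
        # Only keep the topic if the customer's reply opens with an affirmative.
        first_word = re.findall(r"[a-z']+", following.text.lower())[:1]
        if first_word and first_word[0] in AFFIRMATIVE:
            notes.append(match.group("topic").strip())
    return notes
```

For a transcript in which the agent says “Ok, so you would like to place a new order, correct?” and the customer answers “Yes,” the sketch yields the single note `place a new order`, which the agent could then accept or modify.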

Automatic Data Entry

In accordance with aspects of the disclosure, when Agent Assist detects the participants in a conversation, it may automatically fill out any forms that pop up after such conversations. With reference to FIG. 8, the process begins at 802 where operations 404-414 are performed. These may be performed in parallel with the other features described above. At 804, information is extracted to populate forms. As shown in FIG. 9, in response to the customer indicating that he or she is calling to move forward on a job application, scheduling information may be presented to the agent in a field 508. This information may populate into field 904 of user interface 902, together with additional information in field 906, to schedule the call for an interview with the appropriate person. In another example, if the person says, “Hi my name is John? I like to return my iPhone 6,” a form may pop up with some of the information, such as Name: John and Phone: iPhone 6, prefilled into the form.

Such automated data entry includes, but is not limited to:

Date

Time

Day of the week

First name

Last name

Gender

Address

Object (e.g., Samsung Galaxy)

Type of the object (e.g., Galaxy S9)

Time of the day (e.g., morning, afternoon)

After the information is populated, the process ends at 806.
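As a rough illustration of the extraction step at 804, the following sketch pre-fills a form from a single utterance using regular expressions. It is only a sketch under stated assumptions: the patterns, the two-entry product list, the field names, and the `prefill_form` function are hypothetical, and a production system would more likely rely on the inference processing described earlier than on hand-written patterns.

```python
import re

# Hypothetical pattern for a spoken self-introduction.
NAME_PATTERN = re.compile(r"\b[Mm]y name is (?P<first_name>[A-Z][a-z]+)")

# Hypothetical product catalog with an optional model suffix (e.g., "6" or "S9").
PRODUCT_PATTERN = re.compile(
    r"\b(?P<product>iPhone|Samsung Galaxy)(?:\s+(?P<model>[A-Za-z]?\d+))?"
)

TIMES_OF_DAY = ("morning", "afternoon", "evening")


def prefill_form(utterance):
    """Return a dict of form fields that could be recognized in the utterance."""
    form = {}
    name = NAME_PATTERN.search(utterance)
    if name:
        form["first_name"] = name.group("first_name")
    product = PRODUCT_PATTERN.search(utterance)
    if product:
        form["object"] = product.group("product")
        if product.group("model"):
            form["object_type"] = (
                f"{product.group('product')} {product.group('model')}"
            )
    lowered = utterance.lower()
    for period in TIMES_OF_DAY:
        if re.search(rf"\b{period}\b", lowered):
            form["time_of_day"] = period
            break
    return form
```

Fields that cannot be recognized are simply left out of the result, so the agent would still fill them in by hand.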

Real-Time Analytics and Error Detection

With reference to FIG. 10, Agent Assist may provide for real-time analytics and error detection by monitoring a conversation (e.g., a call, a text, an e-mail, video, chat, etc.) between the customer 110 and agent 120 in real-time to detect the following non-limiting categories:

Compliance: words that should not be said in the conversation.

Competitors: whether the agent says the name of a competitor.

A set of “do's and don'ts”: words that the agent should not say.

If the agent is angry, curses, etc.

If the agent is making fun of the caller.

If the agent talks too fast, too slow, or if there is a delay between words.

If the agent shows empathy.

If the agent violates any policy.

If the agent markets other products.

If the agent talks about personal issues.

If the agent is politically motivated.

If the agent promotes violence.

The process monitors the agent in real-time and expands upon the current state of the art, which is monitoring at the word level, i.e., monitoring the transcript of the conversation and looking for certain words or a variation of such words. For instance, if the agent is talking about pricing, the system may look for words such as “our pricing,” “our price list,” “do you want to know how much our product is,” etc. As another example, the agent may say “our product is beating everybody else,” which means the price is very affordable. Other examples such as these are possible.
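The word-level monitoring described above can be sketched as a phrase watchlist applied to each transcribed utterance. This is a minimal sketch, assuming a hand-maintained watchlist; the `WATCHLIST` contents (including the made-up competitor name), `normalize`, and `flag_utterance` are hypothetical names introduced for illustration.

```python
import re

# Hypothetical watchlist: category -> phrases to look for in the transcript.
WATCHLIST = {
    "pricing": ["our pricing", "our price list", "how much our product is"],
    "competitors": ["acme corp"],  # made-up competitor name
}


def normalize(text):
    """Lowercase and drop punctuation so phrase matching ignores formatting."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def flag_utterance(utterance, watchlist=WATCHLIST):
    """Return (category, phrase) pairs for every watchlist phrase found."""
    # Pad with spaces so that phrases only match on whole-word boundaries.
    padded = f" {normalize(utterance)} "
    hits = []
    for category, phrases in watchlist.items():
        for phrase in phrases:
            if f" {normalize(phrase)} " in padded:
                hits.append((category, phrase))
    return hits
```

A statement such as “our product is beating everybody else” contains none of the listed phrases and would go unflagged, which illustrates the limitation of matching on words alone.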

Artificial Intelligence (AI) Processing/Learning

In accordance with the present disclosure, a layer of deep learning 1002 is applied to create a large set of all potential sentences and instances (natural language understanding 1004) where the agent:

Said X and meant A.

Said Y and meant A.

Said Z but did not mean A.

Said W and meant B.

These sets have several positive and negative examples around concepts, such as “cursing,” “being frustrated,” “rude attitude,” “too pushy for sale,” “soft attitude,” as well as word-level examples, such as “shut up.” Deep learning 1002 does not need to extract features; rather, deep learning takes a set of sentences and classes (a class is positive/negative, good/bad, cursing/not cursing). Deep learning 1002 learns and builds a model out of all of these examples. For example, audio files of conversations 1006 between agents 120 and customers 110 may be input to the deep learning module 1002. Alternatively, transcribed words may be input to the deep learning module 1002. Next, the system uses the learned model to listen to any conversation in real time and to identify the class, such as “cursing/not cursing.” As soon as the system identifies a class, and if it is negative or positive, it can do the following:

Send an alert to a manager

Make an indicator red on the screen

Send a note to the agent to be reviewed in real-time or after the call

Update some data files for reporting and visualization.
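The train-then-classify-then-act loop described above can be sketched as follows. Note the substitution: to keep the example self-contained, a small multinomial naive Bayes classifier stands in for the deep learning module 1002, so this is not the disclosed model. The class labels come from the text, while `SentenceClassifier`, `actions_for`, and the action names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    return sentence.lower().split()


class SentenceClassifier:
    """Multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter()            # label -> number of examples
        self.vocabulary = set()

    def fit(self, examples):
        """Learn from (sentence, class label) pairs."""
        for sentence, label in examples:
            tokens = tokenize(sentence)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, sentence):
        """Return the label with the highest log-probability."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denominator = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokenize(sentence):
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def actions_for(label):
    """Map an identified class to the follow-up actions listed above."""
    if label == "cursing":
        return ["alert_manager", "show_red_indicator", "note_to_agent", "update_report"]
    return ["update_report"]
```

A real deployment would train on far more examples per class; the point of the sketch is only that labeled sentences go in and a class, with its associated actions, comes out.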

As part of the above, the natural language understanding 1004 may be used for intent spotting 1008 to determine intent 1010, which may be used for IVR analysis 1012 and/or agent performance 1014.

In this approach, individual words are not important; rather, the combination of all of the words, the order of the words and all potential variations of them have relevance. Deep learning 1002 considers all of the potential signals that could describe and hint toward a class. This approach is also language agnostic. It does not matter what language the agent or caller speaks; as long as there are a set of words and a set of classes, deep learning 1002 will learn and the model can be applied to the same language. In addition to the above, metadata may be added to every call, such as the time of the call, the duration of the call, the number of times the agent talked over the caller, etc.
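The call metadata mentioned above can be derived from time-stamped speaker segments. The sketch below assumes such segments are available from the speech-to-text stage; the `Segment` type, the field names, and the definition of a talk-over (the agent starting to speak while a customer segment is still in progress) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # "agent" or "customer"
    start: float   # seconds from the start of the call
    end: float


def count_talk_overs(segments):
    """Count agent segments that begin while the customer is still speaking."""
    customer_segments = [s for s in segments if s.speaker == "customer"]
    count = 0
    for segment in segments:
        if segment.speaker != "agent":
            continue
        if any(c.start < segment.start < c.end for c in customer_segments):
            count += 1
    return count


def call_metadata(time_of_call, segments):
    """Assemble the per-call metadata record to attach to the training data."""
    duration = max(s.end for s in segments) - min(s.start for s in segments)
    return {
        "time_of_call": time_of_call,
        "duration_seconds": duration,
        "agent_talk_overs": count_talk_overs(segments),
    }
```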

Listening to Other Agents' Conversations in Real-Time

As described above, Agent Assist may periodically perform the following to classify conversations of other agents. With reference to FIG. 11, the process begins at 1102. At 1104, a feature vector of a conversation is created. Such feature vector(s) include, but are not limited to:

Time of the call

Duration of the call

Topic of the call

Frequency of words in the customer transcription (e.g., Ticket 2, Delay 4, etc.)

Frequency of words in the agent transcription (e.g., rebook 3, etc.)

Cluster conversations based on these features

At 1106, for the conversations happening within a predetermined period (e.g., one month), the following are performed:

Calculate the pointwise mutual information between all of the calls in one cluster

Make a graph of all calls in which the strength of the link is the weight of the pointwise mutual information.

At 1108, for the current file:

Extract features

Find the cluster

Calculate the pointwise mutual information

Find the closest call to the current call

Show the content of the call to the agent.

At 1110, the process ends.
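The text does not specify how pointwise mutual information (PMI) between two calls is computed, so the following sketch adopts one possible interpretation: each vocabulary term is treated as an event, and two calls are compared by how often a term occurs in both relative to how often it occurs in each. Clustering is assumed to have been done already. The functions `terms`, `pmi`, `build_graph`, and `closest_call` are hypothetical names.

```python
import math
from itertools import combinations


def terms(transcript):
    """Set of distinct lowercase tokens in a transcript."""
    return set(transcript.lower().split())


def pmi(terms_a, terms_b, vocabulary_size):
    """PMI = log(p(a, b) / (p(a) * p(b))) over term occurrence; None if disjoint."""
    shared = len(terms_a & terms_b)
    if shared == 0:
        return None
    p_a = len(terms_a) / vocabulary_size
    p_b = len(terms_b) / vocabulary_size
    p_ab = shared / vocabulary_size
    return math.log(p_ab / (p_a * p_b))


def build_graph(cluster):
    """Edges between calls in one cluster, weighted by PMI (step 1106)."""
    call_terms = {call_id: terms(text) for call_id, text in cluster.items()}
    vocabulary = set().union(*call_terms.values())
    graph = {}
    for a, b in combinations(sorted(call_terms), 2):
        weight = pmi(call_terms[a], call_terms[b], len(vocabulary))
        if weight is not None:
            graph[(a, b)] = weight
    return graph


def closest_call(current_transcript, cluster):
    """Identifier of the call with the highest PMI to the current call (step 1108)."""
    current = terms(current_transcript)
    call_terms = {call_id: terms(text) for call_id, text in cluster.items()}
    vocabulary = current.union(*call_terms.values())
    scores = {}
    for call_id, other in call_terms.items():
        weight = pmi(current, other, len(vocabulary))
        if weight is not None:
            scores[call_id] = weight
    return max(scores, key=scores.get) if scores else None
```

The content of the call returned by `closest_call` is what would then be shown to the agent.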

Learning Module

While the process 1100 analyzes calls, Agent Assist learns and improves by analyzing user clicks. As relevant conversations are presented to the agent (see, e.g., 306), if the agent clicks on a conversation and spends time on it, then it means that the conversation is relevant. Further, if the conversation is located, e.g., third on the list, but the agent clicks on the first conversation and moves forward, Agent Assist does not make any assumptions about the conversation. Hence, the rank of the conversation may be of importance depending on the agent's actions. For the sake of simplicity, Agent Assist shows the top three conversations to the agent. If some conversations are ranked equally, Agent Assist picks one based on heuristics; for instance, any conversation that has not been picked recently will be picked.
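The ranking and click-feedback behavior described above can be sketched as follows. The tie-break follows the heuristic in the text (prefer the conversation picked least recently); the function names, the `last_picked` timestamp map, and the `min_dwell` threshold of 10 seconds are assumptions introduced for illustration.

```python
def top_conversations(candidates, last_picked, limit=3):
    """Return the ids of the top-ranked conversations.

    candidates: list of (conversation_id, score) pairs.
    last_picked: conversation_id -> timestamp of the last time it was picked;
    a missing id means it was never picked, so it wins ties.
    """
    ranked = sorted(
        candidates,
        key=lambda item: (-item[1], last_picked.get(item[0], float("-inf"))),
    )
    return [conversation_id for conversation_id, _ in ranked[:limit]]


def record_feedback(relevance, shown, clicked_id, dwell_seconds, min_dwell=10.0):
    """Credit a conversation only when it was clicked and read for a while.

    Conversations that were shown but not clicked receive no update, since
    no assumption is made about them.
    """
    if clicked_id in shown and dwell_seconds >= min_dwell:
        relevance[clicked_id] = relevance.get(clicked_id, 0) + 1
    return relevance
```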

Escalation Assistance

With reference to FIG. 12, there is shown an example operational flow of escalation assistance, which may occur when the agent cannot answer a customer question or when the user is frustrated. With escalation assistance, the agent can transfer the call to his or her supervisor, where the transfer will include a summary of the call, along with highlights of important notes. In this case, the supervisor has insight into the context and reason for the transfer, and the caller does not need to repeat the case over again. The process begins at 1202 where operations 404-414 are performed. These may be performed in parallel with the other features described above. At 1204, information is extracted from the transcript and populated into the smart notes with a call summary. At 1206, notable items may be highlighted. At 1208, the customer is transferred to the supervisor, where the supervisor is fully briefed on the reasons for the transfer. At 1210, the process ends.
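The hand-off at 1204-1208 can be pictured as a small record passed to the supervisor along with the call. This is a minimal sketch; the `Escalation` type, the `highlight` and `build_escalation` functions, and the use of asterisks to mark highlighted terms are hypothetical choices, not part of the disclosure.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Escalation:
    customer_id: str
    summary: str                       # call summary with highlighted terms
    highlights: list = field(default_factory=list)


def highlight(text, important_terms, marker="*"):
    """Wrap each important term in markers, matching longer terms first."""
    if not important_terms:
        return text
    alternation = "|".join(
        re.escape(term) for term in sorted(important_terms, key=len, reverse=True)
    )
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


def build_escalation(customer_id, notes, important_terms):
    """Combine smart notes into a summary and list the terms that appear in it."""
    plain_summary = "; ".join(notes)
    present = [
        term for term in important_terms
        if re.search(rf"\b{re.escape(term)}\b", plain_summary, re.IGNORECASE)
    ]
    return Escalation(customer_id, highlight(plain_summary, present), present)
```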

Thus, the present disclosure describes an Agent Assist tool within a cloud-based contact center environment that is a conversational guide that proactively delivers real-time contextualized next best actions, in-app, to enhance the customer and agent experience. Talkdesk Agent Assist uses AI to empower agents with a personalized assistant that listens, learns and provides intelligent recommendations in every conversation to help resolve complex customer issues faster.

General Purpose Computer Description

FIG. 13 shows an exemplary computing environment in which example embodiments and aspects may be implemented. The computing system environment is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality.

Numerous other general purpose or special purpose computing system environments or configurations may be used. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, servers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, network personal computers (PCs), minicomputers, mainframe computers, embedded systems, distributed computing environments that include any of the above systems or devices, and the like.

Computer-executable instructions, such as program modules, being executed by a computer may be used. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Distributed computing environments may be used where tasks are performed by remote processing devices that are linked through a communications network or other data transmission medium. In a distributed computing environment, program modules and other data may be located in both local and remote computer storage media including memory storage devices.

With reference to FIG. 13, an exemplary system for implementing aspects described herein includes a computing device, such as computing device 1300. In its most basic configuration, computing device 1300 typically includes at least one processing unit 1302 and memory 1304. Depending on the exact configuration and type of computing device, memory 1304 may be volatile (such as random access memory (RAM)), non-volatile (such as read-only memory (ROM), flash memory, etc.), or some combination of the two. This most basic configuration is illustrated in FIG. 13 by dashed line 1306.

Computing device 1300 may have additional features/functionality. For example, computing device 1300 may include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in FIG. 13 by removable storage 1308 and non-removable storage 1310.

Computing device 1300 typically includes a variety of tangible computer readable media. Computer readable media can be any available tangible media that can be accessed by device 1300 and includes both volatile and non-volatile media, removable and non-removable media.

Tangible computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Memory 1304, removable storage 1308, and non-removable storage 1310 are all examples of computer storage media. Tangible computer storage media include, but are not limited to, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 1300. Any such computer storage media may be part of computing device 1300.

Computing device 1300 may contain communications connection(s) 1312 that allow the device to communicate with other devices. Computing device 1300 may also have input device(s) 1314 such as a keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 1316 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.

It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. Thus, the methods and apparatus of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the presently disclosed subject matter. In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. One or more programs may implement or utilize the processes described in connection with the presently disclosed subject matter, e.g., through the use of an application programming interface (API), reusable controls, or the like. Such programs may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language and it may be combined with hardware implementations.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

1. A method for automated services in a geographically-distributed cloud-based contact center, the method comprising: receiving, by the contact center, a speech communication from a customer; converting, by the contact center, the speech communication to text to perform inference processing on the text to determine a customer intent; automatically analyzing, by the contact center, the text to determine a subject of the speech communication and key terms associated with the subject; automatically parsing, by the contact center, a knowledgebase using the key terms for at least one responsive answer associated with the subject; and providing, by the contact center, the solution to an agent in a unified interface during the communication with the customer.

2. The method of claim 1, further comprising: querying a customer relationship management (CRM) platform/a customer service management (CSM) platform using the key terms; and displaying responsive results from the CRM/CSM in the second field in the unified interface.

3. The method of claim 1, further comprising: querying a database of customer-agent transcripts using the key terms; and displaying responsive results from the database of customer-agent transcripts in the second field in the unified interface.

4. The method of claim 1, wherein the method is performed in real-time as the customer communication progresses with the agent.

5. The method of claim 1, further comprising concurrently displaying to the agent the text in a first field of the unified interface and the solution in a second field of the unified interface.

6. The method of claim 1, further comprising separating the customer voice from the agent voice.

7. The method of claim 6, further comprising applying unsupervised methods to the separated customer voice.

8. The method of claim 7, wherein the unsupervised methods comprise applying biometrics to authenticate the customer, predicting a customer gender, predicting a customer age category, predicting a customer accent, and predicting customer demographics.

9. The method of claim 6, further comprising extracting features associated with the customer.

10. The method of claim 9, wherein the features comprise pain, agony, empathy, sarcasm, speech speed, tone, frustration, enthusiasm, interest and engagement.

11. A geographically-distributed cloud-based software platform comprising: one or more computer processors; and one or more computer-readable mediums storing instructions that, when executed by the one or more computer processors, cause the cloud-based software platform to perform automated-services operations comprising: receiving, by the contact center, a speech communication from a customer; converting, by the contact center, the speech communication to text to perform inference processing on the text to determine a customer intent; automatically analyzing, by the contact center, the text to determine a subject of the speech communication and key terms associated with the subject; automatically parsing, by the contact center, a knowledgebase using the key terms for at least one responsive answer associated with the subject; and providing, by the contact center, the solution to an agent in a unified interface during the communication with the customer.

12. The cloud-based software platform of claim 11, further comprising instructions to cause operations comprising: querying a customer relationship management (CRM) platform/a customer service management (CSM) platform using the key terms; and displaying responsive results from the CRM/CSM in the second field in the unified interface.

13. The cloud-based software platform of claim 11, further comprising instructions to cause operations comprising: querying a database of customer-agent transcripts using the key terms; and displaying responsive results from the database of customer-agent transcripts in the second field in the unified interface.

14. The cloud-based software platform of claim 11, wherein the operations are performed in real-time as the customer communication progresses with the agent.

15. The cloud-based software platform of claim 11, further comprising instructions to cause operations comprising concurrently displaying to the agent the text in a first field of the unified interface and the solution in a second field of the unified interface.

16. The cloud-based software platform of claim 11, further comprising instructions to cause operations comprising separating the customer voice from the agent voice.

17. The cloud-based software platform of claim 16, further comprising instructions to cause operations comprising applying unsupervised methods to the separated customer voice.

18. The cloud-based software platform of claim 17, wherein the unsupervised methods comprise applying biometrics to authenticate the customer, predicting a customer gender, predicting a customer age category, predicting a customer accent, and predicting customer demographics.

19. The cloud-based software platform of claim 16, further comprising instructions to cause operations comprising extracting features associated with the customer.

20. The cloud-based software platform of claim 19, wherein the features comprise pain, agony, empathy, sarcasm, speech speed, tone, frustration, enthusiasm, interest and engagement.